FIFA PLAYERS dataset
The dataset contains about 18,000 FIFA players scraped from sofifa.com.
As loaded, it has 92 variables (columns).
The columns used in the analysis are:
age: Age (continuous)
height_cm: Height in cm (continuous)
weight_kgs: Weight in kg (continuous)
nationality: Nationality (categorical)
overall_rating: Overall rating (continuous)
potential: Potential (continuous)
value_euro: Value in euros (continuous)
wage_euro: Wage in euros (continuous)
preferred_foot: Preferred foot (categorical)
international_reputation.1.5.: International reputation, 1 to 5 (categorical)
weak_foot.1.5.: Weak foot, 1 to 5 (categorical)
skill_moves.1.5.: Skill moves, 1 to 5 (categorical)
work_rate: Work rate (categorical)
body_type: Body type (Lean, Stocky, etc.) (categorical)
release_clause_euro: Release clause in euros (continuous)
club_team: Club team (categorical)
club_rating: Club rating (continuous)
club_position: Position at the club (categorical)
crossing: Crossing (continuous)
finishing: Finishing (continuous)
heading_accuracy: Heading accuracy (continuous)
short_passing: Short passing (continuous)
volleys: Volleys (continuous)
dribbling: Dribbling (continuous)
curve: Curve (continuous)
freekick_accuracy: Free kick accuracy (continuous)
long_passing: Long passing (continuous)
ball_control: Ball control (continuous)
acceleration: Acceleration (continuous)
sprint_speed: Sprint speed (continuous)
agility: Agility (continuous)
reactions: Reactions (continuous)
balance: Balance (continuous)
shot_power: Shot power (continuous)
jumping: Jumping (continuous)
stamina: Stamina (continuous)
strength: Strength (continuous)
long_shots: Long shots (continuous)
aggression: Aggression (continuous)
interceptions: Interceptions (continuous)
positioning: Positioning (continuous)
vision: Vision (continuous)
penalties: Penalties (continuous)
composure: Composure (continuous)
marking: Marking (continuous)
standing_tackle: Standing tackle (continuous)
sliding_tackle: Sliding tackle (continuous)
GK_diving: Goalkeeper diving (continuous)
GK_handling: Goalkeeper handling (continuous)
GK_kicking: Goalkeeper kicking (continuous)
GK_positioning: Goalkeeper positioning (continuous)
GK_reflexes: Goalkeeper reflexes (continuous)
In total, 9 categorical and 43 continuous variables:
library(readr)
fifa_cleaned <- read_delim("fifa_cleaned.csv",
delim = ";", escape_double = FALSE, trim_ws = TRUE)
## Rows: 17954 Columns: 92
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ";"
## chr (45): name, full_name, birth_date, height_cm, positions, nationality, va...
## dbl (47): p, age, weight_kgs, overall_rating, potential, wage_euro, internat...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
fifa_cleaned=data.frame(fifa_cleaned)
head(fifa_cleaned)
## p name full_name birth_date age height_cm
## 1 158023 L. Messi Lionel Andrés Messi Cuccittini 24/06/1987 31 170.18
## 2 190460 C. Eriksen Christian Dannemann Eriksen 14/02/1992 27 154.94
## 3 195864 P. Pogba Paul Pogba 15/03/1993 25 190.5
## 4 198219 L. Insigne Lorenzo Insigne 4/06/1991 27 162.56
## 5 201024 K. Koulibaly Kalidou Koulibaly 20/06/1991 27 187.96
## 6 203376 V. van Dijk Virgil van Dijk 8/07/1991 27 193.04
## weight_kgs positions nationality overall_rating potential value_euro
## 1 72.1 CF,RW,ST Argentina 94 94 110500000.0
## 2 76.2 CAM,RM,CM Denmark 88 89 69500000.0
## 3 83.9 CM,CAM France 88 91 73000000.0
## 4 59.0 LW,ST Italy 88 88 62000000.0
## 5 88.9 CB Senegal 88 91 60000000.0
## 6 92.1 CB Netherlands 88 90 59500000.0
## wage_euro preferred_foot international_reputation.1.5. weak_foot.1.5.
## 1 565000 Left 5 4
## 2 205000 Right 3 5
## 3 255000 Right 4 4
## 4 165000 Right 3 4
## 5 135000 Right 3 3
## 6 215000 Right 3 3
## skill_moves.1.5. work_rate body_type release_clause_euro
## 1 4 Medium/ Low Messi 226500000.0
## 2 4 High/ Medium Lean 13.380.000.000.000.000
## 3 5 High/ Medium Normal 144200000.0
## 4 4 High/ Medium Normal 105400000.0
## 5 2 High/ High Normal 106500000.0
## 6 2 Medium/ Medium Normal 114500000.0
## club_team club_rating club_position club_jersey_number club_join_date
## 1 FC Barcelona 86 RW 10 1/07/2004
## 2 Tottenham Hotspur 83 LCM 23 30/08/2013
## 3 Manchester United 82 LCM 6 9/08/2016
## 4 Napoli 82 LS 24 1/07/2010
## 5 Napoli 82 LCB 26 1/07/2014
## 6 Liverpool 83 LCB 4 1/01/2018
## contract_end_year national_team national_rating national_team_position
## 1 2021 Argentina 82 RF
## 2 2020 Denmark 78 CAM
## 3 2021 France 84 RDM
## 4 2022 Italy 83 LW
## 5 2021 <NA> NA <NA>
## 6 2023 Netherlands 81 LCB
## national_jersey_number crossing finishing heading_accuracy short_passing
## 1 10 86 95 70 92
## 2 10 88 81 52 91
## 3 6 80 75 75 86
## 4 10 86 77 56 85
## 5 NA 30 22 83 68
## 6 4 53 52 83 79
## volleys dribbling curve freekick_accuracy long_passing ball_control
## 1 86 97 93 94 89 96
## 2 80 84 86 87 89 91
## 3 85 87 85 82 90 90
## 4 74 90 87 77 78 93
## 5 14 69 28 28 60 63
## 6 45 70 60 70 81 76
## acceleration sprint_speed agility reactions balance shot_power jumping
## 1 91 86 93 95 95 85 68
## 2 76 73 80 88 81 84 50
## 3 71 79 76 82 66 90 83
## 4 94 86 94 83 93 75 53
## 5 70 75 50 82 40 55 81
## 6 74 77 61 87 49 81 88
## stamina strength long_shots aggression interceptions positioning vision
## 1 72 66 94 48 22 94 94
## 2 92 58 89 46 56 84 91
## 3 88 87 82 78 64 82 88
## 4 75 44 84 34 26 83 87
## 5 75 94 15 87 88 24 49
## 6 75 92 64 82 88 41 60
## penalties composure marking standing_tackle sliding_tackle GK_diving
## 1 75 96 33 28 26 6
## 2 67 88 59 57 22 9
## 3 82 87 63 67 67 5
## 4 61 83 51 24 22 8
## 5 33 80 91 88 87 7
## 6 62 87 90 89 84 13
## GK_handling GK_kicking GK_positioning GK_reflexes
## 1 11 15 14 8
## 2 14 7 7 6
## 3 6 2 4 3
## 4 4 14 9 10
## 5 11 7 13 5
## 6 10 13 11 11
## tags
## 1 #Dribbler,#Distance Shooter,#Crosser,#FK Specialist,#Acrobat,#Clinical Finisher,#Complete Forward
## 2 #Playmaker ,#Crosser,#FK Specialist,#Complete Midfielder
## 3 #Dribbler,#Playmaker ,#Strength,#Complete Midfielder
## 4 #Speedster,#Dribbler,#Crosser,#Acrobat
## 5 #Tackling ,#Tactician ,#Strength,#Complete Defender
## 6 #Tactician ,#Strength
## traits
## 1 Finesse Shot,Long Shot Taker (CPU AI Only),Speed Dribbler (CPU AI Only),Playmaker (CPU AI Only),One Club Player,Team Player,Chip Shot (CPU AI Only)
## 2 Flair,Long Shot Taker (CPU AI Only),Playmaker (CPU AI Only),Technical Dribbler (CPU AI Only),Takes Finesse Free Kicks
## 3 Flair,Long Passer (CPU AI Only),Long Shot Taker (CPU AI Only),Playmaker (CPU AI Only),Technical Dribbler (CPU AI Only)
## 4 Finesse Shot,Long Shot Taker (CPU AI Only),Speed Dribbler (CPU AI Only),Takes Finesse Free Kicks
## 5 Power Header
## 6 Injury Free,Leadership,Power Header
## LS ST RS LW LF CF RF RW LAM CAM RAM LM LCM CM RCM
## 1 89+2 89+2 89+2 93+2 93+2 93+2 93+2 93+2 93+2 93+2 93+2 91+2 85+2 85+2 85+2
## 2 79+3 79+3 79+3 85+3 84+3 84+3 84+3 85+3 86+3 86+3 86+3 86+3 85+3 85+3 85+3
## 3 81+3 81+3 81+3 82+3 83+3 83+3 83+3 82+3 84+3 84+3 84+3 83+3 84+3 84+3 84+3
## 4 78+3 78+3 78+3 86+3 85+3 85+3 85+3 86+3 86+3 86+3 86+3 86+3 78+3 78+3 78+3
## 5 53+3 53+3 53+3 53+3 54+3 54+3 54+3 53+3 55+3 55+3 55+3 57+3 61+3 61+3 61+3
## 6 68+3 68+3 68+3 66+3 67+3 67+3 67+3 66+3 68+3 68+3 68+3 68+3 73+3 73+3 73+3
## RM LWB LDM CDM RDM RWB LB LCB CB RCB RB
## 1 91+2 64+2 61+2 61+2 61+2 64+2 59+2 48+2 48+2 48+2 59+2
## 2 86+3 71+3 71+3 71+3 71+3 71+3 66+3 57+3 57+3 57+3 66+3
## 3 83+3 76+3 77+3 77+3 77+3 76+3 74+3 72+3 72+3 72+3 74+3
## 4 86+3 63+3 58+3 58+3 58+3 63+3 58+3 44+3 44+3 44+3 58+3
## 5 57+3 73+3 77+3 77+3 77+3 73+3 76+3 85+3 85+3 85+3 76+3
## 6 68+3 78+3 82+3 82+3 82+3 78+3 80+3 86+3 86+3 86+3 80+3
Converting each variable to its appropriate type
toma_datos=data.frame(
# categorical data
nacionalidad=factor(fifa_cleaned$nationality),
pie_preferido=factor(fifa_cleaned$preferred_foot),
reputacion=factor(fifa_cleaned$international_reputation.1.5.),
pie_debil=factor(fifa_cleaned$weak_foot.1.5.),
habilidad_movimiento=factor( fifa_cleaned$skill_moves.1.5.),
indice_trabajo=factor(fifa_cleaned$work_rate),
tipo_cuerpo=factor(fifa_cleaned$body_type),
equipo_club=factor(fifa_cleaned$club_team),
posicio_club=factor(fifa_cleaned$club_position),
# continuous data
edad=fifa_cleaned$age,
altura=as.numeric(fifa_cleaned$height_cm),
peso=fifa_cleaned$weight_kgs,
calificacion_general=fifa_cleaned$overall_rating,
potencial=fifa_cleaned$potential,
valor_euro=as.numeric(fifa_cleaned$value_euro),
salario=fifa_cleaned$wage_euro,
#Clausura_contrato=fifa_cleaned$release_clause_euro, # many missing values
rank_club=fifa_cleaned$club_rating,
# skill scores
cruce=fifa_cleaned$crossing,
finalizacion=fifa_cleaned$finishing,
cabezazos=fifa_cleaned$heading_accuracy,
pases_cortos=fifa_cleaned$short_passing,
voleas=fifa_cleaned$volleys,
regateo=fifa_cleaned$dribbling,
curva=fifa_cleaned$curve,
presicion_tiro_libre=fifa_cleaned$freekick_accuracy,
pase_largo=fifa_cleaned$long_passing,
control_balon=fifa_cleaned$ball_control,
aceleracion=fifa_cleaned$acceleration,
spring=fifa_cleaned$sprint_speed,
agilidad=fifa_cleaned$agility,
reaccion=fifa_cleaned$reactions,
balance=fifa_cleaned$balance,
potencia_tiro=fifa_cleaned$shot_power,
salto=fifa_cleaned$jumping,
resistencia=fifa_cleaned$stamina,
fuerza=fifa_cleaned$strength,
tiro_largo=fifa_cleaned$long_shots,
agresividad=fifa_cleaned$aggression,
intersepciones=fifa_cleaned$interceptions,
posicionamiento=fifa_cleaned$positioning,
vision=fifa_cleaned$vision,
penaltis=fifa_cleaned$penalties,
composicion=fifa_cleaned$composure,
marcaje=fifa_cleaned$marking,
entrada_pie=fifa_cleaned$standing_tackle,
entrada_deslisante=fifa_cleaned$sliding_tackle,
# goalkeeper skills
arqueo_portero=fifa_cleaned$GK_diving,
manejo_portero=fifa_cleaned$GK_handling,
patada_portero=fifa_cleaned$GK_kicking,
posicionamiento_portero=fifa_cleaned$GK_positioning,
reflejos_portero=fifa_cleaned$GK_reflexes
)
## Warning in data.frame(nacionalidad = factor(fifa_cleaned$nationality),
## pie_preferido = factor(fifa_cleaned$preferred_foot), : NAs introducidos por
## coerción
## Warning in data.frame(nacionalidad = factor(fifa_cleaned$nationality),
## pie_preferido = factor(fifa_cleaned$preferred_foot), : NAs introducidos por
## coerción
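The two coercion warnings most likely come from the two `as.numeric()` calls (on `height_cm` and `value_euro`), since `read_delim` parsed both columns as character. A minimal sketch, using a made-up vector rather than the real file, of how the values that cannot be parsed could be listed before they silently become NA:

```r
# Hypothetical character column with malformed entries (not taken from the real file)
x <- c("170.18", "154.94", "13.380.000.000.000.000", "", "190.5")

# Convert while silencing the warning, then flag entries that became NA
x_num <- suppressWarnings(as.numeric(x))
bad <- is.na(x_num) & !is.na(x)

# Values that could not be parsed as numbers, and how many there are
x[bad]
sum(bad)
```

Running the same check on the real columns would show whether the NAs come from empty cells or from values with a broken number format.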
#toma_datos
data <- na.omit(toma_datos)
#data
summary(data)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## England : 1599 Left : 3988 1:15569 1: 146 1:1906
## Germany : 1157 Right:13095 2: 1171 2: 3554 2:8043
## Spain : 1042 3: 288 3:10662 3:6196
## France : 898 4: 50 4: 2495 4: 891
## Argentina: 857 5: 5 5: 226 5: 47
## Brazil : 804
## (Other) :10726
## indice_trabajo tipo_cuerpo equipo_club
## Medium/ Medium:9194 Normal :9893 AS Monaco : 33
## High/ Medium :2965 Lean :6163 Atlético Madrid : 33
## Medium/ High :1613 Stocky :1020 Cardiff City : 33
## High/ High : 899 Akinfenwa : 1 Eintracht Frankfurt: 33
## Medium/ Low : 856 C. Ronaldo: 1 FC Barcelona : 33
## High/ Low : 625 Courtois : 1 Manchester City : 33
## (Other) : 931 (Other) : 4 (Other) :16885
## posicio_club edad altura peso
## SUB :7467 Min. :17.00 Min. :152.4 Min. : 49.90
## RES :2785 1st Qu.:22.00 1st Qu.:154.9 1st Qu.: 69.90
## LCB : 625 Median :25.00 Median :175.3 Median : 74.80
## RCB : 625 Mean :25.54 Mean :174.8 Mean : 75.31
## GK : 591 3rd Qu.:29.00 3rd Qu.:185.4 3rd Qu.: 79.80
## LB : 524 Max. :46.00 Max. :205.7 Max. :110.20
## (Other):4466
## calificacion_general potencial valor_euro salario
## Min. :47.00 Min. :50.00 Min. : 10000 Min. : 1000
## 1st Qu.:62.00 1st Qu.:67.00 1st Qu.: 325000 1st Qu.: 1000
## Median :66.00 Median :71.00 Median : 700000 Median : 3000
## Mean :66.21 Mean :71.44 Mean : 2479941 Mean : 9909
## 3rd Qu.:71.00 3rd Qu.:75.00 3rd Qu.: 2100000 3rd Qu.: 9000
## Max. :94.00 Max. :95.00 Max. :110500000 Max. :565000
##
## rank_club cruce finalizacion cabezazos
## Min. :54.00 Min. : 5.00 Min. : 2.00 Min. : 4.00
## 1st Qu.:66.00 1st Qu.:38.00 1st Qu.:30.00 1st Qu.:45.00
## Median :69.00 Median :54.00 Median :49.00 Median :56.00
## Mean :69.35 Mean :49.84 Mean :45.45 Mean :52.37
## 3rd Qu.:72.00 3rd Qu.:64.00 3rd Qu.:62.00 3rd Qu.:64.00
## Max. :86.00 Max. :93.00 Max. :95.00 Max. :93.00
##
## pases_cortos voleas regateo curva
## Min. : 7.00 Min. : 3.00 Min. : 4.00 Min. : 6.00
## 1st Qu.:54.00 1st Qu.:30.00 1st Qu.:49.00 1st Qu.:34.00
## Median :62.00 Median :44.00 Median :61.00 Median :49.00
## Mean :58.68 Mean :42.86 Mean :55.42 Mean :47.23
## 3rd Qu.:68.00 3rd Qu.:57.00 3rd Qu.:68.00 3rd Qu.:62.00
## Max. :93.00 Max. :90.00 Max. :97.00 Max. :94.00
##
## presicion_tiro_libre pase_largo control_balon aceleracion
## Min. : 3.00 Min. : 9.00 Min. : 5.00 Min. :12.00
## 1st Qu.:31.00 1st Qu.:43.00 1st Qu.:54.00 1st Qu.:57.00
## Median :41.00 Median :56.00 Median :63.00 Median :67.00
## Mean :42.76 Mean :52.74 Mean :58.37 Mean :64.78
## 3rd Qu.:56.00 3rd Qu.:64.00 3rd Qu.:69.00 3rd Qu.:75.00
## Max. :94.00 Max. :93.00 Max. :96.00 Max. :97.00
##
## spring agilidad reaccion balance
## Min. :12.00 Min. :11.00 Min. :24.00 Min. :16.00
## 1st Qu.:58.00 1st Qu.:55.00 1st Qu.:56.00 1st Qu.:56.00
## Median :68.00 Median :66.00 Median :62.00 Median :66.00
## Mean :64.89 Mean :63.45 Mean :61.81 Mean :63.89
## 3rd Qu.:75.00 3rd Qu.:74.00 3rd Qu.:68.00 3rd Qu.:74.00
## Max. :96.00 Max. :96.00 Max. :96.00 Max. :96.00
##
## potencia_tiro salto resistencia fuerza tiro_largo
## Min. : 2.00 Min. :15.00 Min. :12.00 Min. :20.0 Min. : 3.00
## 1st Qu.:45.00 1st Qu.:58.00 1st Qu.:56.00 1st Qu.:58.0 1st Qu.:32.00
## Median :59.00 Median :66.00 Median :66.00 Median :67.0 Median :51.00
## Mean :55.45 Mean :65.14 Mean :63.29 Mean :65.3 Mean :46.95
## 3rd Qu.:68.00 3rd Qu.:73.00 3rd Qu.:74.00 3rd Qu.:74.0 3rd Qu.:62.00
## Max. :95.00 Max. :95.00 Max. :96.00 Max. :97.0 Max. :94.00
##
## agresividad intersepciones posicionamiento vision
## Min. :11.00 Min. : 5.00 Min. : 2 Min. :10.00
## 1st Qu.:44.00 1st Qu.:26.00 1st Qu.:39 1st Qu.:44.00
## Median :59.00 Median :53.00 Median :55 Median :55.00
## Mean :56.01 Mean :46.86 Mean :50 Mean :53.47
## 3rd Qu.:69.00 3rd Qu.:64.00 3rd Qu.:64 3rd Qu.:64.00
## Max. :95.00 Max. :92.00 Max. :95 Max. :94.00
##
## penaltis composicion marcaje entrada_pie
## Min. : 5.00 Min. :12.00 Min. : 3.00 Min. : 2.00
## 1st Qu.:39.00 1st Qu.:51.00 1st Qu.:30.00 1st Qu.:27.00
## Median :49.00 Median :60.00 Median :53.00 Median :55.00
## Mean :48.44 Mean :58.71 Mean :47.33 Mean :47.94
## 3rd Qu.:60.00 3rd Qu.:67.00 3rd Qu.:64.00 3rd Qu.:66.00
## Max. :92.00 Max. :96.00 Max. :94.00 Max. :93.00
##
## entrada_deslisante arqueo_portero manejo_portero patada_portero
## Min. : 3.00 Min. : 1.00 Min. : 1.00 Min. : 1.0
## 1st Qu.:24.00 1st Qu.: 8.00 1st Qu.: 8.00 1st Qu.: 8.0
## Median :52.00 Median :11.00 Median :11.00 Median :11.0
## Mean :45.91 Mean :16.58 Mean :16.37 Mean :16.2
## 3rd Qu.:64.00 3rd Qu.:14.00 3rd Qu.:14.00 3rd Qu.:14.0
## Max. :90.00 Max. :90.00 Max. :92.00 Max. :92.0
##
## posicionamiento_portero reflejos_portero
## Min. : 1.00 Min. : 1.00
## 1st Qu.: 8.00 1st Qu.: 8.00
## Median :11.00 Median :11.00
## Mean :16.34 Mean :16.67
## 3rd Qu.:14.00 3rd Qu.:14.00
## Max. :90.00 Max. :94.00
##
# Identify the numeric columns
numeric_columns <- sapply(data, is.numeric)
# Draw a histogram for each numeric column
par(mfrow = c(3, 3)) # Arrange the plots in a 3x3 grid
for (col in names(data)[numeric_columns]) {
  hist(data[[col]], main = paste("Histograma de", col), xlab = col, col = "lightblue", border = "black")
}
The histograms of valor_euro (market value) and salario are problematic: both variables are strongly right-skewed.
hist(data[['valor_euro']], main = paste("Histograma de", 'valor_euro'), xlab = 'valor_euro', col = "lightblue", border = "black")
# Draw a box plot for each numeric column
par(mfrow = c(1, 3)) # Arrange the plots in a 1x3 grid
for (col in names(data)[numeric_columns]) {
  boxplot(data[[col]], main = paste("Diagrama de caja de", col), ylab = col, col = "lightgreen", border = "black")
}
The box plots show problems in salario, in valor_euro, and in the last variables, which are specific to evaluating goalkeepers.
Since valor_euro and salario have many outliers, we apply a logarithmic transformation to both.
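To illustrate why a logarithm helps here, the following sketch counts the points that a box plot would flag as outliers before and after the transformation. It uses simulated right-skewed values, not the FIFA data, and the log-normal shape is only an assumption made for the illustration.

```r
set.seed(1)
# Simulated right-skewed values standing in for valor_euro (assumed log-normal)
x <- rlnorm(1000, meanlog = 13, sdlog = 1.5)

# boxplot.stats() returns the points a box plot would draw as outliers
n_out_raw <- length(boxplot.stats(x)$out)
n_out_log <- length(boxplot.stats(log(x))$out)

c(raw = n_out_raw, log = n_out_log)
```

The transformation is only defined for strictly positive values, which holds for both variables in the summary above (minimum 10000 for valor_euro and 1000 for salario).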
boxplot(data[['valor_euro']], main = paste("Diagrama de caja de",'valor_euro'), ylab = 'valor_euro', col = "lightgreen", border = "black")
df <- data.frame(
Valor = c(50, 20, 30, 10, 40)
)
df$Valor[order(df$Valor)]
## [1] 10 20 30 40 50
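The toy example above relies on `order()`, which returns the positions of the elements in increasing order rather than the sorted values themselves. A short sketch of that relationship, using the same toy vector:

```r
x <- c(50, 20, 30, 10, 40)

# order() gives the index permutation that sorts x
idx <- order(x)
idx

# Indexing by that permutation is equivalent to sort(x)
x[idx]
identical(x[idx], sort(x))
```

Using `order()` instead of `sort()` is convenient when the same permutation must be applied to other columns of a data frame.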
plot(data[['valor_euro']])
plot(data$valor_euro[order(data[['valor_euro']])])
plot(log(data$valor_euro[order(data[['valor_euro']])]))
Applying ln to 'valor_euro':
boxplot(log(data[['valor_euro']]), main = paste("Diagrama de caja de",'ln_valor_euro'), ylab = 'ln_valor_euro', col = "lightgreen", border = "black")
and to 'salario':
boxplot(log(data[['salario']]), main = paste("Diagrama de caja de",'ln_salario'), ylab = 'ln_salario', col = "lightgreen", border = "black")
We now take the data frame "data" and add the transformed variables to it as new columns.
data$ln_salario=log(data[['salario']])
data$ln_valor_euro=log(data[['valor_euro']])
boxplot(data[['ln_salario']], main = paste("Diagrama de caja de",'ln_salario'), ylab = 'ln_salario', col = "lightgreen", border = "black")
boxplot(data[['ln_valor_euro']], main = paste("Diagrama de caja de",'ln_valor_euro'), ylab = 'ln_valor_euro', col = "lightgreen", border = "black")
Let us look at all the box plots again.
numeric_columns <- sapply(data, is.numeric)
# Draw a box plot for each numeric column
par(mfrow = c(1, 3)) # Arrange the plots in a 1x3 grid
for (col in names(data)[numeric_columns]) {
boxplot(data[[col]], main = paste("Diagrama de caja de", col), ylab = col, col = "lightgreen", border = "black")
}
Given the large difference between goalkeepers and non-goalkeepers, we keep only the players who are not goalkeepers. To do this we use each player's position at the club: if that position is goalkeeper (GK), the player is removed.
head(data)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## 1 Argentina Left 5 4 4
## 2 Denmark Right 3 5 4
## 3 France Right 4 4 5
## 4 Italy Right 3 4 4
## 5 Senegal Right 3 3 2
## 6 Netherlands Right 3 3 2
## indice_trabajo tipo_cuerpo equipo_club posicio_club edad altura peso
## 1 Medium/ Low Messi FC Barcelona RW 31 170.18 72.1
## 2 High/ Medium Lean Tottenham Hotspur LCM 27 154.94 76.2
## 3 High/ Medium Normal Manchester United LCM 25 190.50 83.9
## 4 High/ Medium Normal Napoli LS 27 162.56 59.0
## 5 High/ High Normal Napoli LCB 27 187.96 88.9
## 6 Medium/ Medium Normal Liverpool LCB 27 193.04 92.1
## calificacion_general potencial valor_euro salario rank_club cruce
## 1 94 94 110500000 565000 86 86
## 2 88 89 69500000 205000 83 88
## 3 88 91 73000000 255000 82 80
## 4 88 88 62000000 165000 82 86
## 5 88 91 60000000 135000 82 30
## 6 88 90 59500000 215000 83 53
## finalizacion cabezazos pases_cortos voleas regateo curva presicion_tiro_libre
## 1 95 70 92 86 97 93 94
## 2 81 52 91 80 84 86 87
## 3 75 75 86 85 87 85 82
## 4 77 56 85 74 90 87 77
## 5 22 83 68 14 69 28 28
## 6 52 83 79 45 70 60 70
## pase_largo control_balon aceleracion spring agilidad reaccion balance
## 1 89 96 91 86 93 95 95
## 2 89 91 76 73 80 88 81
## 3 90 90 71 79 76 82 66
## 4 78 93 94 86 94 83 93
## 5 60 63 70 75 50 82 40
## 6 81 76 74 77 61 87 49
## potencia_tiro salto resistencia fuerza tiro_largo agresividad intersepciones
## 1 85 68 72 66 94 48 22
## 2 84 50 92 58 89 46 56
## 3 90 83 88 87 82 78 64
## 4 75 53 75 44 84 34 26
## 5 55 81 75 94 15 87 88
## 6 81 88 75 92 64 82 88
## posicionamiento vision penaltis composicion marcaje entrada_pie
## 1 94 94 75 96 33 28
## 2 84 91 67 88 59 57
## 3 82 88 82 87 63 67
## 4 83 87 61 83 51 24
## 5 24 49 33 80 91 88
## 6 41 60 62 87 90 89
## entrada_deslisante arqueo_portero manejo_portero patada_portero
## 1 26 6 11 15
## 2 22 9 14 7
## 3 67 5 6 2
## 4 22 8 4 14
## 5 87 7 11 7
## 6 84 13 10 13
## posicionamiento_portero reflejos_portero ln_salario ln_valor_euro
## 1 14 8 13.24458 18.52053
## 2 7 6 12.23077 18.05684
## 3 4 3 12.44902 18.10597
## 4 9 10 12.01370 17.94264
## 5 13 5 11.81303 17.90986
## 6 11 11 12.27839 17.90149
# Build the data frame excluding the 'GK' values in the 'posicio_club' column
data_sin_GK <- subset(data, posicio_club != "GK")
# Build the data frame with only the 'GK' values in the 'posicio_club' column
data_solo_GK <- subset(data, posicio_club == "GK")
# Check the dimensions of the new data frames
print(dim(data))
## [1] 17083 53
print(dim(data_sin_GK))
## [1] 16492 53
print(dim(data_solo_GK))
## [1] 591 53
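The dimensions above add up (16492 + 591 = 17083), so no row was lost in the split. A minimal sketch of that check on a toy data frame; the column names `pos` and `score` are made up for the illustration:

```r
# Toy data frame with a hypothetical position column
toy <- data.frame(
  pos = factor(c("GK", "RW", "LCB", "GK", "SUB")),
  score = 1:5
)

no_gk <- subset(toy, pos != "GK")
only_gk <- subset(toy, pos == "GK")

# The two parts should partition the original rows
nrow(no_gk) + nrow(only_gk) == nrow(toy)
```

The equality would fail if the position column contained NA values, because `subset()` drops rows where the condition is NA from both parts. Here that cannot happen because `na.omit()` was applied earlier.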
Let us use a bar chart to check that all goalkeepers were really removed:
# Draw a bar chart for the 'posicio_club' column
barplot(
  table(data_sin_GK$posicio_club), # Count of each category
  main = "Frecuencia de posiciones en 'posicio_club'",
  xlab = "Posiciones",
  ylab = "Frecuencia",
  col = "lightblue",
  border = "black",
  las = 2 # Rotate the x-axis labels so that long ones fit
)
We check the frequency of GK:
frecuencias <- table(data_sin_GK$posicio_club)
frecuencias["GK"]
## GK
## 0
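The GK entry still appears, with a count of 0, because subsetting a factor keeps all of its original levels. A small sketch on a toy factor showing this behavior and how `droplevels()` removes the unused level, should that be preferred for later plots or models:

```r
# Toy factor of positions (illustrative values)
pos <- factor(c("GK", "RW", "LCB", "GK", "SUB"))

# Filtering keeps the unused "GK" level, so its count is reported as 0
kept <- pos[pos != "GK"]
table(kept)["GK"]

# droplevels() discards levels that no longer occur
kept_clean <- droplevels(kept)
levels(kept_clean)
```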
Box plots without goalkeepers
numeric_columns <- sapply(data_sin_GK, is.numeric)
# Draw a box plot for each numeric column
par(mfrow = c(1, 3)) # Arrange the plots in a 1x3 grid
for (col in names(data_sin_GK)[numeric_columns]) {
boxplot(data_sin_GK[[col]], main = paste("Diagrama de caja de", col), ylab = col, col = "lightgreen", border = "black")
}
Box plots for goalkeepers only
numeric_columns <- sapply(data_solo_GK, is.numeric)
# Draw a box plot for each numeric column
par(mfrow = c(1, 3)) # Arrange the plots in a 1x3 grid
for (col in names(data_solo_GK)[numeric_columns]) {
boxplot(data_solo_GK[[col]], main = paste("Diagrama de caja de", col), ylab = col, col = "lightgreen", border = "black")
}
Note that the last box plots look much better for the players who are goalkeepers at their clubs, whereas for non-goalkeepers these variables behave unevenly. It makes no sense to evaluate an outfield player on goalkeeping skills, so the study will not take these goalkeeper-only variables into account.
head(data_sin_GK)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## 1 Argentina Left 5 4 4
## 2 Denmark Right 3 5 4
## 3 France Right 4 4 5
## 4 Italy Right 3 4 4
## 5 Senegal Right 3 3 2
## 6 Netherlands Right 3 3 2
## indice_trabajo tipo_cuerpo equipo_club posicio_club edad altura peso
## 1 Medium/ Low Messi FC Barcelona RW 31 170.18 72.1
## 2 High/ Medium Lean Tottenham Hotspur LCM 27 154.94 76.2
## 3 High/ Medium Normal Manchester United LCM 25 190.50 83.9
## 4 High/ Medium Normal Napoli LS 27 162.56 59.0
## 5 High/ High Normal Napoli LCB 27 187.96 88.9
## 6 Medium/ Medium Normal Liverpool LCB 27 193.04 92.1
## calificacion_general potencial valor_euro salario rank_club cruce
## 1 94 94 110500000 565000 86 86
## 2 88 89 69500000 205000 83 88
## 3 88 91 73000000 255000 82 80
## 4 88 88 62000000 165000 82 86
## 5 88 91 60000000 135000 82 30
## 6 88 90 59500000 215000 83 53
## finalizacion cabezazos pases_cortos voleas regateo curva presicion_tiro_libre
## 1 95 70 92 86 97 93 94
## 2 81 52 91 80 84 86 87
## 3 75 75 86 85 87 85 82
## 4 77 56 85 74 90 87 77
## 5 22 83 68 14 69 28 28
## 6 52 83 79 45 70 60 70
## pase_largo control_balon aceleracion spring agilidad reaccion balance
## 1 89 96 91 86 93 95 95
## 2 89 91 76 73 80 88 81
## 3 90 90 71 79 76 82 66
## 4 78 93 94 86 94 83 93
## 5 60 63 70 75 50 82 40
## 6 81 76 74 77 61 87 49
## potencia_tiro salto resistencia fuerza tiro_largo agresividad intersepciones
## 1 85 68 72 66 94 48 22
## 2 84 50 92 58 89 46 56
## 3 90 83 88 87 82 78 64
## 4 75 53 75 44 84 34 26
## 5 55 81 75 94 15 87 88
## 6 81 88 75 92 64 82 88
## posicionamiento vision penaltis composicion marcaje entrada_pie
## 1 94 94 75 96 33 28
## 2 84 91 67 88 59 57
## 3 82 88 82 87 63 67
## 4 83 87 61 83 51 24
## 5 24 49 33 80 91 88
## 6 41 60 62 87 90 89
## entrada_deslisante arqueo_portero manejo_portero patada_portero
## 1 26 6 11 15
## 2 22 9 14 7
## 3 67 5 6 2
## 4 22 8 4 14
## 5 87 7 11 7
## 6 84 13 10 13
## posicionamiento_portero reflejos_portero ln_salario ln_valor_euro
## 1 14 8 13.24458 18.52053
## 2 7 6 12.23077 18.05684
## 3 4 3 12.44902 18.10597
## 4 9 10 12.01370 17.94264
## 5 13 5 11.81303 17.90986
## 6 11 11 12.27839 17.90149
names(data_sin_GK)
## [1] "nacionalidad" "pie_preferido"
## [3] "reputacion" "pie_debil"
## [5] "habilidad_movimiento" "indice_trabajo"
## [7] "tipo_cuerpo" "equipo_club"
## [9] "posicio_club" "edad"
## [11] "altura" "peso"
## [13] "calificacion_general" "potencial"
## [15] "valor_euro" "salario"
## [17] "rank_club" "cruce"
## [19] "finalizacion" "cabezazos"
## [21] "pases_cortos" "voleas"
## [23] "regateo" "curva"
## [25] "presicion_tiro_libre" "pase_largo"
## [27] "control_balon" "aceleracion"
## [29] "spring" "agilidad"
## [31] "reaccion" "balance"
## [33] "potencia_tiro" "salto"
## [35] "resistencia" "fuerza"
## [37] "tiro_largo" "agresividad"
## [39] "intersepciones" "posicionamiento"
## [41] "vision" "penaltis"
## [43] "composicion" "marcaje"
## [45] "entrada_pie" "entrada_deslisante"
## [47] "arqueo_portero" "manejo_portero"
## [49] "patada_portero" "posicionamiento_portero"
## [51] "reflejos_portero" "ln_salario"
## [53] "ln_valor_euro"
We remove the goalkeeper-specific variables
library(dplyr)
##
## Adjuntando el paquete: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Drop the five goalkeeper-only columns in a single call
data_sin_GK <- select(
  data_sin_GK,
  -c(reflejos_portero, posicionamiento_portero, patada_portero, arqueo_portero, manejo_portero)
)
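Since all the goalkeeper columns share the `_portero` suffix, they could also be dropped by pattern. The following is only a base R sketch on a toy data frame, not the method used in this analysis:

```r
# Toy data frame with two goalkeeper-style columns (illustrative values)
toy <- data.frame(
  edad = c(31, 27),
  arqueo_portero = c(6, 9),
  reflejos_portero = c(8, 6)
)

# Flag the columns whose name ends in "_portero" and keep the rest
gk_cols <- grepl("_portero$", names(toy))
toy_no_gk <- toy[, !gk_cols, drop = FALSE]

names(toy_no_gk)
```

Matching by suffix avoids listing each column by hand, at the cost of silently dropping any future column that happens to end the same way.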
head(data_sin_GK)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## 1 Argentina Left 5 4 4
## 2 Denmark Right 3 5 4
## 3 France Right 4 4 5
## 4 Italy Right 3 4 4
## 5 Senegal Right 3 3 2
## 6 Netherlands Right 3 3 2
## indice_trabajo tipo_cuerpo equipo_club posicio_club edad altura peso
## 1 Medium/ Low Messi FC Barcelona RW 31 170.18 72.1
## 2 High/ Medium Lean Tottenham Hotspur LCM 27 154.94 76.2
## 3 High/ Medium Normal Manchester United LCM 25 190.50 83.9
## 4 High/ Medium Normal Napoli LS 27 162.56 59.0
## 5 High/ High Normal Napoli LCB 27 187.96 88.9
## 6 Medium/ Medium Normal Liverpool LCB 27 193.04 92.1
## calificacion_general potencial valor_euro salario rank_club cruce
## 1 94 94 110500000 565000 86 86
## 2 88 89 69500000 205000 83 88
## 3 88 91 73000000 255000 82 80
## 4 88 88 62000000 165000 82 86
## 5 88 91 60000000 135000 82 30
## 6 88 90 59500000 215000 83 53
## finalizacion cabezazos pases_cortos voleas regateo curva presicion_tiro_libre
## 1 95 70 92 86 97 93 94
## 2 81 52 91 80 84 86 87
## 3 75 75 86 85 87 85 82
## 4 77 56 85 74 90 87 77
## 5 22 83 68 14 69 28 28
## 6 52 83 79 45 70 60 70
## pase_largo control_balon aceleracion spring agilidad reaccion balance
## 1 89 96 91 86 93 95 95
## 2 89 91 76 73 80 88 81
## 3 90 90 71 79 76 82 66
## 4 78 93 94 86 94 83 93
## 5 60 63 70 75 50 82 40
## 6 81 76 74 77 61 87 49
## potencia_tiro salto resistencia fuerza tiro_largo agresividad intersepciones
## 1 85 68 72 66 94 48 22
## 2 84 50 92 58 89 46 56
## 3 90 83 88 87 82 78 64
## 4 75 53 75 44 84 34 26
## 5 55 81 75 94 15 87 88
## 6 81 88 75 92 64 82 88
## posicionamiento vision penaltis composicion marcaje entrada_pie
## 1 94 94 75 96 33 28
## 2 84 91 67 88 59 57
## 3 82 88 82 87 63 67
## 4 83 87 61 83 51 24
## 5 24 49 33 80 91 88
## 6 41 60 62 87 90 89
## entrada_deslisante ln_salario ln_valor_euro
## 1 26 13.24458 18.52053
## 2 22 12.23077 18.05684
## 3 67 12.44902 18.10597
## 4 22 12.01370 17.94264
## 5 87 11.81303 17.90986
## 6 84 12.27839 17.90149
numeric_columns <- sapply(data_sin_GK, is.numeric)
# Draw a box plot for each numeric column
par(mfrow = c(1, 3)) # Arrange the plots in a 1x3 grid
for (col in names(data_sin_GK)[numeric_columns]) {
boxplot(data_sin_GK[[col]], main = paste("Diagrama de caja de", col), ylab = col, col = "lightgreen", border = "black")
}
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.2
# Select the factor columns in data_sin_GK
factor_columns <- sapply(data_sin_GK, is.factor)
# Draw a simple bar chart for each factor column
for (col in names(data_sin_GK)[factor_columns]) {
  # Convert the column to a data frame of frequencies
  data_plot <- as.data.frame(table(data_sin_GK[[col]]))
  colnames(data_plot) <- c("Category", "Frequency")
  # Build the bar chart with plain colors
  p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
    geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Plain colors
    labs(title = paste("Diagrama de Barras de", col)) +
    theme_minimal() + # Clean theme
    theme(
      legend.title = element_blank(),
      axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels if needed
    )
  # Show the plot
  print(p)
}
The bar chart of nacionalidad is very uneven, so we group the nationalities into continents.
library(countrycode)
## Warning: package 'countrycode' was built under R version 4.4.2
# Vector of countries
paises <- data_sin_GK$nacionalidad
# Get the continent
continentes <- countrycode(paises, origin = "country.name", destination = "continent")
## Warning: Some values were not matched unambiguously: England, Kosovo, Northern Ireland, Scotland, Wales
# Store the results
nacionalidad_continente=data.frame(Pais = paises, Continente = continentes)
#nacionalidad_continente
All the nationalities left as NA (England, Kosovo, Northern Ireland, Scotland, and Wales, according to the warning above) are in Europe, so:
# Replace the NA values in the Continente column with "Europe"
nacionalidad_continente$Continente[is.na(nacionalidad_continente$Continente)] <- "Europe"
# View the updated data frame
#print(nacionalidad_continente)
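Filling every NA with "Europe" is only safe if the unmatched countries are indeed all European. A minimal sketch of that check, run before the replacement on a toy stand-in for `nacionalidad_continente`; the rows are illustrative, and the `european` vector simply repeats the names from the warning:

```r
# Toy stand-in for nacionalidad_continente (illustrative rows)
nc <- data.frame(
  Pais = c("England", "Brazil", "Kosovo", "Senegal", "Wales"),
  Continente = c(NA, "Americas", NA, "Africa", NA),
  stringsAsFactors = FALSE
)

# Countries that were not matched to a continent
unmatched <- unique(nc$Pais[is.na(nc$Continente)])

# Names reported as unmatched in the warning, all of them in Europe
european <- c("England", "Kosovo", "Northern Ireland", "Scotland", "Wales")
stopifnot(all(unmatched %in% european))

# Only after the check passes, fill the gaps
nc$Continente[is.na(nc$Continente)] <- "Europe"
nc
```

The `countrycode()` function also documents a `custom_match` argument that can map such names directly, which may be worth considering instead of a blanket replacement.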
We add this column to the data
data_sin_GK$Continente=factor(nacionalidad_continente$Continente)
Let us look at the bar chart by continent
col='Continente'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with plain colors
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Plain colors
  labs(title = paste("Diagrama de Barras de", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels if needed
  )
# Show the plot
print(p)
col='reputacion'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with plain colors
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Plain colors
  labs(title = paste("Diagrama de Barras de", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels if needed
  )
# Show the plot
print(p)
Let us reduce reputation to only 2 categories:
High reputation: 2, 3, 4, 5
Low reputation: 1
data_sin_GK$new_reputacion <- factor(ifelse(data_sin_GK$reputacion == 1, "bajo", "alta"))
# Check the results
head(data_sin_GK)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## 1 Argentina Left 5 4 4
## 2 Denmark Right 3 5 4
## 3 France Right 4 4 5
## 4 Italy Right 3 4 4
## 5 Senegal Right 3 3 2
## 6 Netherlands Right 3 3 2
## indice_trabajo tipo_cuerpo equipo_club posicio_club edad altura peso
## 1 Medium/ Low Messi FC Barcelona RW 31 170.18 72.1
## 2 High/ Medium Lean Tottenham Hotspur LCM 27 154.94 76.2
## 3 High/ Medium Normal Manchester United LCM 25 190.50 83.9
## 4 High/ Medium Normal Napoli LS 27 162.56 59.0
## 5 High/ High Normal Napoli LCB 27 187.96 88.9
## 6 Medium/ Medium Normal Liverpool LCB 27 193.04 92.1
## calificacion_general potencial valor_euro salario rank_club cruce
## 1 94 94 110500000 565000 86 86
## 2 88 89 69500000 205000 83 88
## 3 88 91 73000000 255000 82 80
## 4 88 88 62000000 165000 82 86
## 5 88 91 60000000 135000 82 30
## 6 88 90 59500000 215000 83 53
## finalizacion cabezazos pases_cortos voleas regateo curva presicion_tiro_libre
## 1 95 70 92 86 97 93 94
## 2 81 52 91 80 84 86 87
## 3 75 75 86 85 87 85 82
## 4 77 56 85 74 90 87 77
## 5 22 83 68 14 69 28 28
## 6 52 83 79 45 70 60 70
## pase_largo control_balon aceleracion spring agilidad reaccion balance
## 1 89 96 91 86 93 95 95
## 2 89 91 76 73 80 88 81
## 3 90 90 71 79 76 82 66
## 4 78 93 94 86 94 83 93
## 5 60 63 70 75 50 82 40
## 6 81 76 74 77 61 87 49
## potencia_tiro salto resistencia fuerza tiro_largo agresividad intersepciones
## 1 85 68 72 66 94 48 22
## 2 84 50 92 58 89 46 56
## 3 90 83 88 87 82 78 64
## 4 75 53 75 44 84 34 26
## 5 55 81 75 94 15 87 88
## 6 81 88 75 92 64 82 88
## posicionamiento vision penaltis composicion marcaje entrada_pie
## 1 94 94 75 96 33 28
## 2 84 91 67 88 59 57
## 3 82 88 82 87 63 67
## 4 83 87 61 83 51 24
## 5 24 49 33 80 91 88
## 6 41 60 62 87 90 89
## entrada_deslisante ln_salario ln_valor_euro Continente new_reputacion
## 1 26 13.24458 18.52053 Americas alta
## 2 22 12.23077 18.05684 Europe alta
## 3 67 12.44902 18.10597 Europe alta
## 4 22 12.01370 17.94264 Europe alta
## 5 87 11.81303 17.90986 Africa alta
## 6 84 12.27839 17.90149 Europe alta
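The first rows only show high-reputation players, so `head()` alone does not confirm that the recoding worked for every level. One way to check it, sketched here on made-up reputation values rather than the real `data_sin_GK`, is to cross-tabulate the original values against the new factor.

```r
# Toy reputation values standing in for data_sin_GK$reputacion
reputacion <- c(1, 1, 2, 3, 5, 1, 4)
new_reputacion <- factor(ifelse(reputacion == 1, "bajo", "alta"))

# Cross-tabulate the original values against the recoded factor:
# every count for value 1 should fall under "bajo", all others under "alta"
check <- table(reputacion, new_reputacion)
print(check)
```

On the real data the equivalent call would be `table(data_sin_GK$reputacion, data_sin_GK$new_reputacion)`.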
col='new_reputacion'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with a simple color scheme
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Simple colors
  labs(title = paste("Bar chart of", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels for readability
  )
# Show the plot
print(p)
col='tipo_cuerpo'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with a simple color scheme
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Simple colors
  labs(title = paste("Bar chart of", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels for readability
  )
# Show the plot
print(p)
library(dplyr)
data_sin_GK <- data_sin_GK %>%
mutate(new_tipo_cuerpo = case_when(
tipo_cuerpo == 'Lean' ~ 'Lean',
tipo_cuerpo == 'Normal' ~ 'Normal',
tipo_cuerpo == 'Stocky' ~ 'Stocky',
    TRUE ~ 'Unico' # Any value other than 'Lean', 'Normal' or 'Stocky'
  ))
# Look at the first rows of the data frame to check the result
head(data_sin_GK)
## nacionalidad pie_preferido reputacion pie_debil habilidad_movimiento
## 1 Argentina Left 5 4 4
## 2 Denmark Right 3 5 4
## 3 France Right 4 4 5
## 4 Italy Right 3 4 4
## 5 Senegal Right 3 3 2
## 6 Netherlands Right 3 3 2
## indice_trabajo tipo_cuerpo equipo_club posicio_club edad altura peso
## 1 Medium/ Low Messi FC Barcelona RW 31 170.18 72.1
## 2 High/ Medium Lean Tottenham Hotspur LCM 27 154.94 76.2
## 3 High/ Medium Normal Manchester United LCM 25 190.50 83.9
## 4 High/ Medium Normal Napoli LS 27 162.56 59.0
## 5 High/ High Normal Napoli LCB 27 187.96 88.9
## 6 Medium/ Medium Normal Liverpool LCB 27 193.04 92.1
## calificacion_general potencial valor_euro salario rank_club cruce
## 1 94 94 110500000 565000 86 86
## 2 88 89 69500000 205000 83 88
## 3 88 91 73000000 255000 82 80
## 4 88 88 62000000 165000 82 86
## 5 88 91 60000000 135000 82 30
## 6 88 90 59500000 215000 83 53
## finalizacion cabezazos pases_cortos voleas regateo curva presicion_tiro_libre
## 1 95 70 92 86 97 93 94
## 2 81 52 91 80 84 86 87
## 3 75 75 86 85 87 85 82
## 4 77 56 85 74 90 87 77
## 5 22 83 68 14 69 28 28
## 6 52 83 79 45 70 60 70
## pase_largo control_balon aceleracion spring agilidad reaccion balance
## 1 89 96 91 86 93 95 95
## 2 89 91 76 73 80 88 81
## 3 90 90 71 79 76 82 66
## 4 78 93 94 86 94 83 93
## 5 60 63 70 75 50 82 40
## 6 81 76 74 77 61 87 49
## potencia_tiro salto resistencia fuerza tiro_largo agresividad intersepciones
## 1 85 68 72 66 94 48 22
## 2 84 50 92 58 89 46 56
## 3 90 83 88 87 82 78 64
## 4 75 53 75 44 84 34 26
## 5 55 81 75 94 15 87 88
## 6 81 88 75 92 64 82 88
## posicionamiento vision penaltis composicion marcaje entrada_pie
## 1 94 94 75 96 33 28
## 2 84 91 67 88 59 57
## 3 82 88 82 87 63 67
## 4 83 87 61 83 51 24
## 5 24 49 33 80 91 88
## 6 41 60 62 87 90 89
## entrada_deslisante ln_salario ln_valor_euro Continente new_reputacion
## 1 26 13.24458 18.52053 Americas alta
## 2 22 12.23077 18.05684 Europe alta
## 3 67 12.44902 18.10597 Europe alta
## 4 22 12.01370 17.94264 Europe alta
## 5 87 11.81303 17.90986 Africa alta
## 6 84 12.27839 17.90149 Europe alta
## new_tipo_cuerpo
## 1 Unico
## 2 Lean
## 3 Normal
## 4 Normal
## 5 Normal
## 6 Normal
data_sin_GK$new_tipo_cuerpo=factor(data_sin_GK$new_tipo_cuerpo)
col='new_tipo_cuerpo'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with a simple color scheme
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Simple colors
  labs(title = paste("Bar chart of", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels for readability
  )
# Show the plot
print(p)
col='equipo_club'
data_plot <- as.data.frame(table(data_sin_GK[[col]]))
colnames(data_plot) <- c("Category", "Frequency")
# Build the bar chart with a simple color scheme
p <- ggplot(data_plot, aes(x = Category, y = Frequency)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black") + # Simple colors
  labs(title = paste("Bar chart of", col)) +
  theme_minimal() + # Clean theme
  theme(
    legend.title = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1) # Rotate the x-axis labels for readability
  )
# Show the plot
print(p)
library(ggplot2)
# Identify the factor columns
columnas_factores <- names(data_sin_GK)[sapply(data_sin_GK, is.factor)]
# Draw a pie chart for each factor column
for (col in columnas_factores) {
  # Exclude the 'equipo_club' and 'nacionalidad' columns
  if (col %in% c("equipo_club", "nacionalidad")) {
    next # Skip 'equipo_club' and 'nacionalidad'
  }
  # Convert the column into a frequency data frame
  data_plot <- as.data.frame(table(data_sin_GK[[col]]))
  colnames(data_plot) <- c("Category", "Frequency")
  # Build the plot
  p <- ggplot(data_plot, aes(x = "", y = Frequency, fill = Category)) +
    geom_bar(stat = "identity", width = 1) +
    coord_polar(theta = "y") +
    labs(title = paste("Pie chart of", col)) +
    theme_void() +
    theme(legend.title = element_blank())
  # Show the plot
  print(p)
}
# Draw a pie chart for 'nacionalidad' and 'equipo_club' without labels or legends
for (col in c("nacionalidad", "equipo_club")) {
  # Convert the column into a frequency data frame
  data_plot <- as.data.frame(table(data_sin_GK[[col]]))
  colnames(data_plot) <- c("Category", "Frequency")
  # Build the pie chart without labels or legends
  p <- ggplot(data_plot, aes(x = "", y = Frequency, fill = Category)) +
    geom_bar(stat = "identity", width = 1) +
    coord_polar(theta = "y") +
    labs(title = paste("Pie chart of", col)) +
    theme_void() +
    theme(
      legend.position = "none", # Remove the legend
      axis.text = element_blank(), # Remove the slice labels
      axis.title = element_blank(), # Remove the axis titles
      panel.grid = element_blank() # Remove the grid lines
    )
  # Show the plot
  print(p)
}
# Identify the factor columns
columnas_factores <- sapply(data_sin_GK, is.factor)
# Draw a boxplot of ln_salario for each factor column
for (factor_col in names(data_sin_GK)[columnas_factores]) {
  # Skip columns if needed (here 'equipo_club' and 'nacionalidad')
  if (factor_col %in% c("equipo_club", "nacionalidad")) {
    next
  }
  # Draw the boxplot; print() also dumps the list that boxplot() returns
  # (stats, n, conf, out, group, names), which is the output shown below
  plot_title <- paste("Boxplot of ln_salario by", factor_col)
  print(boxplot(data_sin_GK$ln_salario ~ data_sin_GK[[factor_col]],
                main = plot_title,
                las = 1))
}
## $stats
## [,1] [,2]
## [1,] 6.907755 6.907755
## [2,] 7.600902 6.907755
## [3,] 8.294050 8.006368
## [4,] 9.210340 9.104980
## [5,] 11.608236 12.388394
##
## $n
## [1] 3916 12576
##
## $conf
## [,1] [,2]
## [1,] 8.253414 7.975410
## [2,] 8.334686 8.037325
##
## $out
## [1] 13.24458 11.88449 12.48749 12.23077 11.95118 12.38839 12.42922 12.46844
## [9] 12.18075 12.04355 11.81303 11.84940 11.69525 11.65269 11.65269 11.69525
## [17] 11.65269 11.95118 11.84940 11.84940 12.01370 11.77529 11.65269 12.01370
## [25] 11.88449 12.07254 11.84940 11.98293 12.23077 11.91839 11.98293 12.27839
## [33] 12.10071 12.18075 12.23077 12.77987 12.32386 12.56024 12.44902 12.61154
## [41] 12.66033 12.46844 12.46844 12.48749 12.56024 12.48749 12.73670 12.64433
## [49] 12.77987 13.02805 12.94801 12.73670 12.77987 12.57764 12.91164
##
## $group
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [39] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##
## $names
## [1] "Left" "Right"
## $stats
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6.907755 8.006368 8.853665 9.903488 12.57764
## [2,] 6.907755 9.472705 10.434116 11.429544 12.74464
## [3,] 8.006368 10.165852 11.156251 12.043554 12.96985
## [4,] 8.853665 10.668955 11.695247 12.468437 13.13632
## [5,] 11.736069 12.230765 12.736701 12.948010 13.24458
##
## $n
## [1] 15057 1116 270 45 4
##
## $conf
## [,1] [,2] [,3] [,4] [,5]
## [1,] 7.981312 10.10927 11.03499 11.79886 12.66042
## [2,] 8.031424 10.22243 11.27752 12.28825 13.27927
##
## $out
## [1] 11.813030 11.849398 6.907755 6.907755 6.907755 6.907755 6.907755
## [8] 6.907755 7.600902 6.907755 7.600902 7.600902 7.600902 7.600902
## [15] 6.907755 6.907755 7.600902 6.907755 7.600902 6.907755 7.600902
## [22] 7.600902 7.600902 6.907755 7.600902 7.600902 6.907755 7.600902
## [29] 7.600902 7.600902 7.600902 6.907755 7.600902 7.600902 7.600902
## [36] 6.907755 7.600902 7.600902 6.907755 6.907755 6.907755 6.907755
## [43] 6.907755 6.907755 6.907755 7.600902 6.907755 6.907755 6.907755
## [50] 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755 8.294050
## [57] 8.294050 8.517193 6.907755 6.907755 9.852194 9.472705 9.210340
## [64] 9.798127
##
## $group
## [1] 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [39] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 4 4
##
## $names
## [1] "1" "2" "3" "4" "5"
## $stats
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 6.907755 6.907755 6.907755 7.600902 7.600902
## [3,] 7.600902 7.600902 8.006368 8.699515 8.699515
## [4,] 8.699515 8.699515 8.987197 9.852194 9.998798
## [5,] 10.950807 11.362103 12.100712 13.028053 12.779873
##
## $n
## [1] 113 3316 10373 2470 220
##
## $conf
## [,1] [,2] [,3] [,4] [,5]
## [1,] 7.334586 7.551740 7.974109 8.627943 8.444082
## [2,] 7.867219 7.650064 8.038627 8.771086 8.954947
##
## $out
## [1] 11.69525 11.50288 11.45105 11.60824 11.69525 11.65269 11.39639 11.42954
## [9] 11.47210 11.56172 11.65269 11.77529 11.60824 11.60824 12.01370 11.88449
## [17] 12.07254 11.84940 11.47210 12.56024 12.27839 12.66033 12.48749 12.23077
## [25] 12.42922 12.18075 12.46844 12.46844 12.38839 12.18075 12.18075 12.23077
## [33] 12.27839 12.25486 12.18075 12.25486 12.23077 12.56024 12.77987 12.32386
## [41] 12.64433 13.24458
##
## $group
## [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [39] 3 3 3 4
##
## $names
## [1] "1" "2" "3" "4" "5"
## $stats
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 6.907755 6.907755 8.006368 9.104980 9.678387
## [3,] 6.907755 7.600902 8.517193 9.998798 10.896739
## [4,] 8.294050 8.294050 9.546813 10.691945 11.691763
## [5,] 10.373491 10.373491 11.849398 12.948010 12.911642
##
## $n
## [1] 1315 8043 6196 891 47
##
## $conf
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6.847354 7.576479 8.486273 9.914797 10.43272
## [2,] 6.968157 7.625326 8.548114 10.082799 11.36076
##
## $out
## [1] 12.18075 10.62133 10.69194 10.75790 10.59663 10.85900 10.43412 10.69194
## [9] 10.57132 10.49127 10.49127 10.43412 10.57132 10.51867 10.69194 10.66896
## [17] 10.81978 10.71442 11.41861 11.01863 10.85900 10.81978 10.46310 11.81303
## [25] 11.56172 11.81303 12.27839 11.39639 12.01370 12.38839 11.05089 11.56172
## [33] 11.42954 11.81303 10.57132 10.73640 10.49127 11.28978 10.43412 11.50288
## [41] 10.79958 10.91509 10.54534 11.19821 10.51867 10.51867 10.46310 11.60824
## [49] 11.45105 10.89674 10.73640 10.77896 10.87805 10.96820 10.87805 11.60824
## [57] 10.46310 10.87805 10.73640 10.81978 10.69194 10.77896 10.64542 10.87805
## [65] 10.51867 10.71442 10.81978 10.77896 10.59663 10.46310 10.71442 10.71442
## [73] 10.64542 10.69194 10.83958 10.64542 10.40426 11.05089 10.62133 10.71442
## [81] 10.51867 10.57132 10.51867 10.66896 10.49127 10.73640 10.46310 10.51867
## [89] 10.40426 10.73640 10.40426 10.81978 10.83958 10.46310 10.69194 10.62133
## [97] 10.43412 10.49127 10.59663 10.69194 10.40426 10.77896 10.93311 10.46310
## [105] 10.57132 10.46310 10.57132 10.73640 10.43412 10.64542 10.79958 10.40426
## [113] 10.59663 10.49127 10.75790 10.59663 10.54534 10.40426 10.59663 10.64542
## [121] 10.51867 10.59663 10.54534 10.46310 10.77896 10.73640 11.06664 10.85900
## [129] 11.05089 10.54534 10.59663 11.03489 10.64542 11.33857 11.22524 10.49127
## [137] 10.89674 10.71442 10.66896 10.83958 11.23849 10.81978 10.40426 10.66896
## [145] 10.93311 10.73640 10.51867 10.51867 10.87805 10.54534 10.40426 10.83958
## [153] 11.08214 10.87805 10.40426 10.69194 10.57132 10.57132 10.95081 10.69194
## [161] 10.69194 10.75790 10.46310 11.00210 10.96820 11.30220 10.57132 10.40426
## [169] 10.87805 10.77896 10.69194 11.28978 10.77896 10.93311 11.33857 10.66896
## [177] 11.12726 11.05089 10.79958 10.91509 10.83958 11.69525 10.73640 10.98529
## [185] 10.96820 11.35041 11.19821 10.95081 11.14186 10.64542 11.65269 11.56172
## [193] 10.79958 11.65269 10.69194 11.27720 11.65269 10.69194 11.25156 11.60824
## [201] 11.46163 11.28978 11.36210 11.77529 11.31447 11.45105 12.10071 11.11245
## [209] 10.66896 11.38509 11.38509 11.11245 11.42954 11.91839 12.01370 11.98293
## [217] 11.38509 12.04355 11.88449 11.65269 11.65269 11.47210 11.49272 11.91839
## [225] 11.98293 12.07254 12.25486 12.23077 12.56024 12.01370 12.32386 11.73607
## [233] 12.20607 12.66033 12.20607 12.23077 12.46844 11.95118 12.42922 12.01370
## [241] 12.07254 12.32386 12.23077 12.10071 11.98293 12.27839 11.98293 12.64433
## [249] 12.77987 12.25486 13.02805 13.24458
##
## $group
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
## [38] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [149] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [186] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## [223] 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4
##
## $names
## [1] "1" "2" "3" "4" "5"
## $stats
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 7.803635 7.600902 7.600902 7.600902 7.600902 6.907755 7.600902
## [3,] 8.699515 8.517193 8.517193 8.294050 8.006368 8.006368 8.294050
## [4,] 9.998798 9.546813 9.546813 9.210340 8.517193 8.853665 9.472705
## [5,] 12.948010 12.388394 12.449019 11.472103 9.680344 11.461632 12.254863
## [,8] [,9]
## [1,] 6.907755 6.907755
## [2,] 7.600902 6.907755
## [3,] 8.294050 7.600902
## [4,] 9.305651 8.699515
## [5,] 11.775290 11.362103
##
## $n
## [1] 899 625 2965 402 31 498 1613 856 8603
##
## $conf
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 8.583839 8.394212 8.460730 8.167221 7.746346 7.868594 8.220412 8.201988
## [2,] 8.815191 8.640175 8.573657 8.420879 8.266389 8.144141 8.367687 8.386112
## [,9]
## [1,] 7.570381
## [2,] 7.631424
##
## $out
## [1] 12.91164 12.61154 12.48749 12.46844 12.56024 12.77987 12.73670 12.64433
## [9] 12.56024 13.02805 12.73670 12.57764 11.98293 12.46844 12.38839 12.32386
## [17] 13.24458 12.07254 12.48749 12.48749 12.27839 12.66033 12.23077 12.01370
## [25] 11.84940 12.18075 12.46844 12.01370 11.84940 11.84940 11.73607 11.39639
## [33] 11.50288 11.45105 11.56172 11.60824 11.40756 11.65269 11.56172 11.39639
## [41] 11.69525 11.60824 11.41861 11.39639 11.41861 11.77529 11.65269 11.41861
## [49] 11.56172 11.95118 11.84940 11.81303 11.84940 11.81303 11.77529 11.41861
## [57] 11.49272 11.56172 11.65269 11.60824 11.65269 12.23077 11.49272 11.91839
## [65] 12.10071 12.07254 12.27839 12.77987
##
## $group
## [1] 2 3 3 3 3 3 3 3 3 3 3 3 4 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
## [39] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
##
## $names
## [1] "High/ High" "High/ Low" "High/ Medium" "Low/ High"
## [5] "Low/ Low" "Low/ Medium" "Medium/ High" "Medium/ Low"
## [9] "Medium/ Medium"
## $stats
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 7.600902 12.91164 NA 6.907755 13.24458 12.57764 6.907755 12.48749
## [2,] 7.600902 12.91164 NA 6.907755 13.24458 12.57764 6.907755 12.48749
## [3,] 7.600902 12.91164 NA 8.006368 13.24458 12.57764 8.006368 12.48749
## [4,] 7.600902 12.91164 NA 8.987197 13.24458 12.57764 9.104980 12.48749
## [5,] 7.600902 12.91164 NA 12.100712 13.24458 12.57764 12.388394 12.48749
## [,9] [,10]
## [1,] 11.8494 6.907755
## [2,] 11.8494 7.600902
## [3,] 11.8494 8.294050
## [4,] 11.8494 9.305651
## [5,] 11.8494 11.695247
##
## $n
## [1] 1 1 0 6044 1 1 9454 1 1 988
##
## $conf
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 7.600902 12.91164 NA 7.964106 13.24458 12.57764 7.970663 12.48749
## [2,] 7.600902 12.91164 NA 8.048629 13.24458 12.57764 8.042072 12.48749
## [,9] [,10]
## [1,] 11.8494 8.208358
## [2,] 11.8494 8.379741
##
## $out
## [1] 12.23077 12.20607 12.66033 12.20607 12.46844 12.23077 12.25486 12.18075
## [9] 12.18075 12.25486 12.77987 12.48749 12.94801 12.44902 12.42922 12.46844
## [17] 12.46844 12.48749 12.56024 12.73670 12.64433 12.56024 12.77987 13.02805
## [25] 12.73670 12.77987 12.61154 12.23077 12.27839
##
## $group
## [1] 4 4 4 4 4 4 4 4 4 4 4 4 4 7 7 7 7 7 7 7 7 7 7 7 7
## [26] 7 10 10 10
##
## $names
## [1] "Akinfenwa" "C. Ronaldo" "Courtois"
## [4] "Lean" "Messi" "Neymar"
## [7] "Normal" "PLAYER_BODY_TYPE_25" "Shaqiri"
## [10] "Stocky"
## $stats
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755 NA 8.294050 6.907755
## [2,] 7.600902 7.600902 8.006368 7.254329 7.600902 NA 9.210340 7.600902
## [3,] 8.699515 8.517193 8.699515 8.573858 8.517193 NA 9.546813 8.294050
## [4,] 9.680344 9.509759 9.798127 9.875202 9.544255 NA 9.947722 9.210340
## [5,] 12.468437 11.066638 12.388394 11.112448 11.849398 NA 10.341742 11.608236
## [,9] [,10] [,11] [,12] [,13] [,14] [,15]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 7.600902 7.600902 7.600902 7.600902 7.600902 8.006368 8.006368
## [3,] 8.517193 8.517193 8.517193 8.294050 8.405621 8.853665 8.699515
## [4,] 9.392662 9.740969 9.546813 8.987197 9.392662 9.952278 9.877841
## [5,] 11.982929 12.779873 11.849398 10.621327 11.695247 12.254863 12.254863
## [,16] [,17] [,18] [,19] [,20] [,21] [,22]
## [1,] 6.907755 8.699515 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 7.600902 9.210340 7.600902 7.600902 7.600902 7.600902 6.907755
## [3,] 8.150209 9.546813 8.294050 8.294050 8.517193 8.294050 7.600902
## [4,] 9.305651 9.952278 9.305651 9.472705 9.615805 9.472705 8.294050
## [5,] 10.985293 10.896739 11.608236 12.254863 12.323856 11.849398 10.373491
## [,23] [,24] [,25] [,26] [,27] [,28] [,29]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 7.254329 7.600902 8.006368 7.600902 7.600902 8.006368 6.907755
## [3,] 8.006368 8.294050 8.987197 8.699515 8.006368 8.853665 8.006368
## [4,] 9.032003 9.392662 9.903488 9.903488 9.157660 9.615805 8.987197
## [5,] 10.518673 11.849398 12.206073 13.244581 10.968198 11.775290 12.100712
##
## $n
## [1] 291 111 144 8 75 0 19 524 625 392 230 13 392 197 136
## [16] 66 21 520 625 397 223 2785 15 388 193 138 64 433 7467
##
## $conf
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 8.506914 8.230928 8.46360 7.10980 8.162643 NA 9.279529 8.182962
## [2,] 8.892115 8.803459 8.93543 10.03792 8.871744 NA 9.814096 8.405137
## [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] 8.403954 8.346412 8.314464 7.686557 8.262635 8.634614 8.445960 7.818661
## [2,] 8.630432 8.687975 8.719922 8.901542 8.548607 9.072717 8.953069 8.481756
## [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
## [1,] 9.291004 8.175932 8.175752 8.357416 8.096004 7.559398 7.281158 8.150328
## [2,] 9.802621 8.412168 8.412348 8.676971 8.492095 7.642407 8.731577 8.437771
## [,25] [,26] [,27] [,28] [,29]
## [1,] 8.771436 8.389820 7.698908 8.731461 7.968346
## [2,] 9.202958 9.009209 8.313827 8.975870 8.044389
##
## $out
## [1] 12.660328 11.211820 12.577636 12.429216 11.652687 11.775290 11.982929
## [8] 12.278393 12.230765 12.323856 12.644328 12.180755 12.487485 12.736701
## [15] 12.911642 6.907755 8.006368 11.849398 12.043554 12.013701 12.230765
## [22] 12.072541 12.560244 12.948010 12.779873 11.418615 11.451050 11.141862
## [29] 10.858999 10.757903 10.736397 10.714418 10.571317 10.571317 10.404263
## [36] 10.668955 10.518673 10.691945 10.668955 10.621327 10.434116 10.463103
## [43] 10.799576 10.596635 10.621327 10.571317 10.404263 10.491274 10.736397
## [50] 10.799576 11.066638 10.463103 10.645425 10.778956 10.736397 11.050890
## [57] 11.338572 10.736397 10.571317 11.082143 11.225243 11.461632 10.819778
## [64] 10.491274 10.691945 11.429544 11.350407 12.013701 11.362103 12.611538
## [71] 12.487485 12.230765 12.072541 12.100712 12.487485 12.230765 13.028053
## [78] 12.388394 12.180755 12.468437 12.180755 12.180755 12.230765 12.278393
## [85] 12.180755 12.779873 12.736701
##
## $group
## [1] 3 7 7 8 8 8 8 9 9 9 9 13 13 15 15 17 17 17 18 18 18 18 19 20 20
## [26] 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22
## [51] 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 28 28 28 28 28 28
## [76] 28 28 29 29 29 29 29 29 29 29 29 29
##
## $names
## [1] "CAM" "CB" "CDM" "CF" "CM" "GK" "LAM" "LB" "LCB" "LCM" "LDM" "LF"
## [13] "LM" "LS" "LW" "LWB" "RAM" "RB" "RCB" "RCM" "RDM" "RES" "RF" "RM"
## [25] "RS" "RW" "RWB" "ST" "SUB"
## $stats
## [,1] [,2] [,3] [,4] [,5]
## [1,] 6.907755 6.907755 6.907755 6.907755 6.907755
## [2,] 7.600902 6.907755 6.907755 6.907755 6.907755
## [3,] 8.517193 8.294050 7.600902 8.006368 6.907755
## [4,] 9.546813 9.305651 8.294050 9.104980 8.006368
## [5,] 12.230765 12.736701 10.373491 12.323856 9.615805
##
## $n
## [1] 1113 3632 1790 9711 246
##
## $conf
## [,1] [,2] [,3] [,4] [,5]
## [1,] 8.425035 8.231184 7.549132 7.971139 6.797084
## [2,] 8.609351 8.356915 7.652673 8.041597 7.018426
##
## $out
## [1] 12.487485 12.487485 13.244581 13.028053 10.757903 10.985293 10.621327
## [8] 10.596635 11.608236 10.878047 10.896739 10.596635 10.839581 10.571317
## [15] 10.757903 10.596635 10.571317 10.668955 10.463103 10.668955 10.621327
## [22] 10.757903 10.757903 10.596635 10.691945 10.839581 10.736397 10.621327
## [29] 10.736397 10.668955 10.571317 10.896739 10.839581 11.461632 10.691945
## [36] 10.757903 10.799576 10.518673 10.915088 11.018629 11.396392 11.813030
## [43] 11.849398 12.449019 12.660328 12.468437 12.429216 12.468437 12.487485
## [50] 12.560244 12.779873 12.644328 12.560244 12.779873 12.948010 12.736701
## [57] 12.779873 12.911642 10.126631 10.126631 9.680344 9.903488 9.903488
## [64] 9.740969 10.043249 11.050890 10.736397 10.933107 10.819778
##
## $group
## [1] 1 1 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [39] 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5
##
## $names
## [1] "Africa" "Americas" "Asia" "Europe" "Oceania"
## $stats
## [,1] [,2]
## [1,] 8.006368 6.907755
## [2,] 9.615805 6.907755
## [3,] 10.308953 8.006368
## [4,] 10.950807 8.853665
## [5,] 12.948010 11.736069
##
## $n
## [1] 1435 15057
##
## $conf
## [,1] [,2]
## [1,] 10.25327 7.981312
## [2,] 10.36463 8.031424
##
## $out
## [1] 13.244581 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755
## [8] 7.600902 6.907755 7.600902 7.600902 7.600902 7.600902 6.907755
## [15] 6.907755 7.600902 6.907755 7.600902 6.907755 7.600902 7.600902
## [22] 7.600902 6.907755 7.600902 7.600902 6.907755 7.600902 7.600902
## [29] 7.600902 7.600902 6.907755 7.600902 7.600902 7.600902 6.907755
## [36] 7.600902 7.600902 6.907755 6.907755 6.907755 6.907755 6.907755
## [43] 6.907755 6.907755 7.600902 6.907755 6.907755 6.907755 6.907755
## [50] 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755 6.907755
## [57] 13.028053 11.813030 11.849398
##
## $group
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [39] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2
##
## $names
## [1] "alta" "bajo"
## $stats
## [,1] [,2] [,3] [,4]
## [1,] 6.907755 6.907755 6.907755 11.84940
## [2,] 6.907755 6.907755 7.600902 11.84940
## [3,] 8.006368 8.006368 8.294050 12.53256
## [4,] 8.987197 9.104980 9.305651 12.91164
## [5,] 12.100712 12.388394 11.695247 13.24458
##
## $n
## [1] 6044 9454 988 6
##
## $conf
## [,1] [,2] [,3] [,4]
## [1,] 7.964106 7.970663 8.208358 11.84738
## [2,] 8.048629 8.042072 8.379741 13.21774
##
## $out
## [1] 12.230765 12.206073 12.660328 12.206073 12.468437 12.230765 12.254863
## [8] 12.180755 12.180755 12.254863 12.779873 12.487485 12.948010 12.449019
## [15] 12.429216 12.468437 12.468437 12.487485 12.560244 12.736701 12.644328
## [22] 12.560244 12.779873 13.028053 12.736701 12.779873 12.611538 12.230765
## [29] 12.278393 7.600902
##
## $group
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 4
##
## $names
## [1] "Lean" "Normal" "Stocky" "Unico"
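Note that reputation, weak foot, skill moves, continent and body type all appear to be associated with salary: the median ln_salario (third row of each $stats block) shifts across their levels, for example from about 8.0 at reputation 1 to about 13.0 at reputation 5.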
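The group medians can be read from the printed `$stats` blocks, but they are easier to compare when extracted directly. The following is a minimal sketch on made-up values, not the real `data_sin_GK`; it relies on the fact that the third row of `boxplot()`'s `stats` matrix holds the median of each group.

```r
# Toy data standing in for data_sin_GK (values are made up)
toy <- data.frame(
  ln_salario = c(7, 8, 9, 10, 11, 12),
  new_reputacion = factor(c("bajo", "bajo", "bajo", "alta", "alta", "alta"))
)

# plot = FALSE returns the boxplot statistics without drawing anything
b <- boxplot(ln_salario ~ new_reputacion, data = toy, plot = FALSE)

# Row 3 of $stats is the median of each group, in the order given by $names
medianas <- setNames(b$stats[3, ], b$names)
print(medianas)

# The same medians computed directly
print(tapply(toy$ln_salario, toy$new_reputacion, median))
```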
library(FactoClass)
## Loading required package: ade4
## Loading required package: ggrepel
## Loading required package: xtable
## Loading required package: scatterplot3d
# Contingency table of new_reputacion by Continente (absolute and relative frequencies)
tc <- table( data_sin_GK$new_reputacion , data_sin_GK$Continente)
tabtc <- cbind(tc, totF = rowSums(tc))
tabtc <- rbind(tabtc, totC=colSums(tabtc))
pf <- prop.table(tc, 1)
pc <- prop.table(tc, 2)
par(mfrow=c(2,1), mai=c(0.4,1,0.3,0.1))
plotct(t(tc), "row", col=2:5)
plotct(tc, "row", col=2:4)
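The row and column profiles plotted above are the conditional proportions stored in `pf` and `pc`. As a small illustration on made-up data (the vectors `rep_toy` and `cont_toy` are hypothetical and only stand in for the real columns), they can be printed and checked like this:

```r
# Toy factors standing in for new_reputacion and Continente
rep_toy  <- factor(c("alta", "bajo", "bajo", "bajo", "alta", "bajo"))
cont_toy <- factor(c("Europe", "Europe", "Africa", "Africa", "Americas", "Europe"))

tc <- table(rep_toy, cont_toy)
pf <- prop.table(tc, 1)  # row profiles: each row sums to 1
pc <- prop.table(tc, 2)  # column profiles: each column sums to 1

print(round(pf, 2))
print(round(pc, 2))
```

Reading `pf` by rows gives the distribution of continents within each reputation group; reading `pc` by columns gives the share of each reputation group within a continent.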
# Select the categorical columns
factor_vars <- names(data_sin_GK)[sapply(data_sin_GK, is.factor)]
# Iterate over every pair of categorical variables
for (i in 1:(length(factor_vars) - 1)) {
  for (j in (i + 1):length(factor_vars)) {
    var1 <- factor_vars[i]
    var2 <- factor_vars[j]
    # Build the contingency table
    tc <- table(data_sin_GK[[var1]], data_sin_GK[[var2]])
    tabtc <- cbind(tc, totF = rowSums(tc))
    tabtc <- rbind(tabtc, totC = colSums(tabtc))
    # Plot the profiles; row profiles of t(tc) are the column profiles of tc
    par(mfrow = c(2, 1), mai = c(0.4, 1, 0.3, 0.1))
    plotct(t(tc), "row", col = 2:(ncol(tc) + 1), main = paste("Column profiles:", var1, "vs", var2))
    plotct(tc, "row", col = 2:(ncol(tc) + 1), main = paste("Row profiles:", var1, "vs", var2))
  }
}
library(corrplot)
## Warning: package 'corrplot' was built under R version 4.4.2
## corrplot 0.95 loaded
# Select only the numeric columns
numeric_data <- data[sapply(data, is.numeric)]
# Shorten the column names; abbreviate() keeps them unique, whereas a
# two-letter prefix would collide (e.g. "peso" and "penaltis" both become "pe")
colnames(numeric_data) <- unname(abbreviate(colnames(numeric_data), minlength = 4))
# Compute the correlation matrix
correlation_matrix <- cor(numeric_data)
# Plot the correlation matrix
corrplot(correlation_matrix, method = "circle")
colnames(data[sapply(data, is.numeric)])
## [1] "edad" "altura"
## [3] "peso" "calificacion_general"
## [5] "potencial" "valor_euro"
## [7] "salario" "rank_club"
## [9] "cruce" "finalizacion"
## [11] "cabezazos" "pases_cortos"
## [13] "voleas" "regateo"
## [15] "curva" "presicion_tiro_libre"
## [17] "pase_largo" "control_balon"
## [19] "aceleracion" "spring"
## [21] "agilidad" "reaccion"
## [23] "balance" "potencia_tiro"
## [25] "salto" "resistencia"
## [27] "fuerza" "tiro_largo"
## [29] "agresividad" "intersepciones"
## [31] "posicionamiento" "vision"
## [33] "penaltis" "composicion"
## [35] "marcaje" "entrada_pie"
## [37] "entrada_deslisante" "arqueo_portero"
## [39] "manejo_portero" "patada_portero"
## [41] "posicionamiento_portero" "reflejos_portero"
## [43] "ln_salario" "ln_valor_euro"
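Several of the column names listed above share their first letters, so shortening them to a fixed two-letter prefix produces duplicate labels on a correlation plot. A short sketch of the problem and one possible remedy, using a handful of the names above (`abbreviate()` is base R and keeps unique inputs unique):

```r
# A few of the numeric column names listed above
nombres <- c("peso", "penaltis", "potencial", "posicionamiento",
             "potencia_tiro", "salario", "salto")

# A fixed two-letter prefix collapses several names into the same label
two_letter <- substr(nombres, 1, 2)
print(two_letter)
print(anyDuplicated(two_letter) > 0)

# abbreviate() shortens the names while keeping them distinct
short <- unname(abbreviate(nombres, minlength = 4))
print(short)
```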
# Load the required libraries
library(ggplot2)
library(reshape2)
## Warning: package 'reshape2' was built under R version 4.4.2
# Select only the numeric columns
numeric_data <- data[sapply(data, is.numeric)]
# Shorten the column names; abbreviate() keeps them unique, whereas a
# two-letter prefix would collide and make tiles overlap in the heatmap
colnames(numeric_data) <- unname(abbreviate(colnames(numeric_data), minlength = 4))
# Compute the correlation matrix
correlation_matrix <- cor(numeric_data)
# Convert the correlation matrix to long format
correlation_melted <- melt(correlation_matrix)
# Flag correlations above 0.7 so they can be outlined in green
correlation_melted$color <- ifelse(correlation_melted$value > 0.7, "green", "white")
# Build the correlation matrix plot
ggplot(correlation_melted, aes(Var1, Var2, fill = value)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "white", high = "blue", mid = "white", midpoint = 0, limits = c(-1, 1)) +
  geom_tile(data = subset(correlation_melted, color == "green"), aes(fill = value), color = "green") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Correlation matrix", fill = "Correlation") +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank())
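Besides highlighting them on the heatmap, the pairs above the 0.7 threshold can be listed as a table. This is a minimal sketch on a made-up data frame (columns `a`, `b` and `z` are hypothetical); it looks only at the upper triangle so each pair appears once and the diagonal is ignored.

```r
# Toy data: b closely tracks a, z alternates and is only weakly related to both
toy <- data.frame(
  a = 1:8,
  b = c(1.1, 1.9, 3.2, 3.8, 5.1, 5.9, 7.2, 7.8),
  z = c(1, -1, 1, -1, 1, -1, 1, -1)
)
cm <- cor(toy)

# Positions in the upper triangle whose correlation exceeds 0.7
idx <- which(upper.tri(cm) & cm > 0.7, arr.ind = TRUE)

pares <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)
print(pares)
```

Applied to `correlation_matrix`, the same steps would give the list of strongly correlated attribute pairs; the 0.7 cutoff is the one used in the plot above.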